A crucial issue with current text generation models is that they often uncontrollably generate text that is factually inconsistent with their inputs. Limited by the lack of annotated data, existing work on evaluating factual consistency directly transfers the reasoning ability of models trained on other data-rich upstream tasks, such as question answering (QA) and natural language inference (NLI), without any further adaptation. As a result, these metrics perform poorly on real generated text and are heavily biased by their single-source upstream tasks. To alleviate this problem, we propose WeCheck, a weakly supervised framework that aggregates multiple resources to train a precise and efficient factual metric. WeCheck first uses a generative model to accurately label a real generated sample by aggregating its weak labels, which are inferred from multiple resources. We then train the target metric model with this weak supervision while taking label noise into account. Comprehensive experiments on a variety of tasks demonstrate the strong performance of WeCheck, which achieves an average absolute improvement of 3.4% over previous state-of-the-art methods on the TRUE benchmark.
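The two steps described above (aggregating weak labels, then training against a noisy target) can be illustrated with a minimal sketch. This is not WeCheck's implementation: the abstract describes a generative labeling model, whereas the sketch swaps in a plain weighted average, and the names `aggregate_weak_labels` and `soft_label_loss` are hypothetical. It only shows how scores from several weak sources might be merged into one soft label, and how a soft label can serve as a noise-tolerant training target.

```python
import math


def aggregate_weak_labels(weak_scores, weights=None):
    """Merge per-source consistency scores in [0, 1] into one soft label.

    Assumption: a weighted average stands in for the generative labeling
    model mentioned in the abstract.
    """
    if not weak_scores:
        raise ValueError("need at least one weak score")
    if weights is None:
        weights = [1.0] * len(weak_scores)
    if len(weights) != len(weak_scores):
        raise ValueError("weights and weak_scores must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, weak_scores)) / total


def soft_label_loss(predicted_prob, soft_label, eps=1e-7):
    """Binary cross-entropy against a soft (possibly noisy) label."""
    # Clamp the prediction so the logarithms stay finite.
    p = min(max(predicted_prob, eps), 1.0 - eps)
    return -(soft_label * math.log(p) + (1.0 - soft_label) * math.log(1.0 - p))


# Three hypothetical weak sources (for example QA-, NLI-, and overlap-based).
label = aggregate_weak_labels([0.9, 0.7, 0.8])
loss = soft_label_loss(0.5, label)
```

Training on the soft label rather than a thresholded 0/1 value is one simple way to keep disagreement between sources visible to the loss.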
translated by Google Translate
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about common practice or about the bottlenecks the community faces in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, and algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive for participation (70%), while prize money played only a minor role (16%). Although a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for it, and 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
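Two of the practices named in the survey, K-fold cross-validation and patch-based processing of oversized samples, can be sketched in a few lines. The code below is a generic illustration, not any participant's pipeline. The helper names `kfold_indices` and `patch_origins` are hypothetical, the folds are contiguous and unshuffled for readability, and patches are laid out along a single axis.

```python
def kfold_indices(n_samples, k):
    """Return k (train_indices, val_indices) pairs over range(n_samples)."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, val))
        start += size
    return folds


def patch_origins(length, patch, stride):
    """Start offsets of patches of size `patch` that cover one axis."""
    if patch <= 0 or stride <= 0 or patch > length:
        raise ValueError("need 0 < patch <= length and stride > 0")
    origins = list(range(0, length - patch + 1, stride))
    # Add a final patch flush with the border if the grid leaves a gap.
    if origins[-1] + patch < length:
        origins.append(length - patch)
    return origins


folds = kfold_indices(10, 3)            # validation folds of size 4, 3, 3
starts = patch_origins(11, 4, 3)        # [0, 3, 6, 7]
```

For a 2D or 3D image, the same `patch_origins` call would be applied per axis and the offsets combined; in practice the sample order is usually shuffled before splitting into folds.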
Recent advances in operator learning theory have improved our knowledge about learning maps between infinite-dimensional spaces. However, for large-scale engineering problems such as concurrent multiscale simulation of mechanical properties, the training cost of current operator learning methods is very high. This article presents a thorough analysis of the mathematical underpinnings of the operator learning paradigm and proposes a kernel learning method that maps between function spaces. We first survey modern kernel and operator learning theory and discuss recent results and open problems. We then present an algorithm for analytically approximating piecewise constant functions on R for operator learning, which suggests that neural operators may be feasible on clustered functions. Finally, a k-means clustered domain based on a mechanistic response is considered, and the Lippmann-Schwinger equation for micro-mechanical homogenization is solved. The article also briefly discusses the mathematics of previous kernel learning methods and some preliminary results obtained with them. The proposed kernel operator learning method uses graph kernel networks to derive a mechanistic reduced-order method for multiscale homogenization.
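With the rapid development of self-supervised learning (e.g., contrastive learning), the importance of having large-scale images (even without annotations) for training more generalizable AI models has been widely recognized in medical image analysis. However, collecting large-scale, task-specific unannotated data can be challenging for individual labs. Existing online resources, such as digital books, publications, and search engines, provide a new source of large-scale images. Yet images published in healthcare (e.g., radiology and pathology) include a considerable number of compound figures with subplots. To extract and separate compound figures into individual images usable for downstream learning, we propose a simple compound figure separation (SimCFS) framework that does not require the traditionally needed detection bounding box annotations, with a new loss function and hard case simulation. Our technical contribution is four-fold: (1) we introduce a simulation-based training framework that minimizes the need for resource-intensive bounding box annotations; (2) we propose a new side loss that is optimized for compound figure separation; (3) we propose an intra-class image augmentation method to simulate hard cases; and (4) to the best of our knowledge, this is the first study that evaluates the efficacy of leveraging self-supervised learning with compound image separation. In our results, the proposed SimCFS achieved state-of-the-art performance on the ImageCLEF 2016 Compound Figure Separation Database. A self-supervised model pretrained with a contrastive learning algorithm on the large-scale mined figures improved the accuracy of downstream image classification tasks. The source code of SimCFS is publicly available at https://github.com/hrlblab/imageseperation.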
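Automated floorplan generation from user inputs has great potential in architectural design and has recently been explored in the computer vision community. However, most existing methods synthesize floorplans as rasterized images, which are difficult to edit or customize. In this paper, we aim to synthesize floorplans as sequences of 1-D vectors, which eases user interaction and design customization. To generate high-fidelity vectorized floorplans, we propose a novel two-stage framework consisting of a draft stage and a multi-round refining stage. In the first stage, we encode the user's room connectivity graph with a graph convolutional network (GCN) and then apply an autoregressive transformer network to generate an initial floorplan sequence. To polish the initial design and generate more visually appealing floorplans, we further propose a novel panoptic refinement network (PRN) composed of a GCN and a transformer network. The PRN takes the initially generated sequence as input and refines the floorplan design, while our proposed geometric loss encourages correct room connectivity. We have conducted extensive experiments on a real-world floorplan dataset, and the results show that our method achieves state-of-the-art performance under different settings and evaluation metrics.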
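Understanding the underlying relationship between tongue and oropharyngeal muscle deformation seen in tagged MRI and intelligible speech plays an important role in advancing speech motor control theories and the treatment of speech-related disorders. Because of their heterogeneous representations, however, a direct mapping between the two modalities (a two-dimensional mid-sagittal slice plus time tagged-MRI sequence and its corresponding one-dimensional waveform) is not straightforward. Instead, we resort to two-dimensional spectrograms as an intermediate representation, which contain both pitch and resonance, and from which we develop an end-to-end deep learning framework that translates a sequence of tagged MRI into its corresponding audio waveform with a limited dataset size. Our framework is based on a novel fully convolutional asymmetric translator guided by a self residual attention strategy, which specifically exploits the moving muscular structures during speech. The framework also employs a latent space representation disentanglement strategy. Furthermore, we incorporate an adversarial training approach with generative adversarial networks to improve the realism of the generated spectrograms. Our framework enables the generation of clear audio waveforms from a sequence of tagged MRI, surpassing competing methods. It therefore offers great potential for better understanding the relationship between the two modalities.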
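A good empathetic dialogue system should first track and understand the user's emotion and then reply with an appropriate emotion. However, current approaches to this task focus either on improving the understanding of users' emotions or on proposing better response strategies, and very few works consider both at the same time. Our work attempts to fill this gap. Inspired by task-oriented dialogue systems, we propose a novel empathetic response generation model with emotion-aware dialogue management. The emotion-aware dialogue management contains two parts: (1) emotion state tracking, which maintains the current emotion state of the user, and (2) empathetic dialogue policy selection, which predicts a target emotion and the user's intent based on the results of emotion state tracking. The predicted information is then used to guide response generation. Experimental results show that dynamically managing these different kinds of information helps the model generate more empathetic responses than several baselines under both automatic and human evaluation.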
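The detection of ancient settlements is a key focus in landscape archaeology. Traditionally, settlements were identified through pedestrian survey, as researchers physically traversed the landscape and recorded settlement locations. More recently, the manual identification and labeling of ancient remains in satellite imagery has increased the scale of archaeological data collection, but the process remains tremendously time-consuming and arduous. The development of self-supervised learning (e.g., contrastive learning) offers a scalable learning scheme for locating archaeological sites using unlabeled satellite and historical aerial images. However, archaeological sites are present in only a very small proportion of the whole landscape, and modern contrastive learning approaches typically yield inferior performance on highly imbalanced datasets, such as when identifying sparsely localized ancient urbanization over a large area from satellite images. In this work, we propose a framework to address this long-tail problem. In contrast to existing contrastive learning approaches, which typically treat labeled and unlabeled data separately, the proposed method reforms the learning paradigm under a semi-supervised setting to fully utilize the precious annotated data (<7% in our setting). Specifically, the highly imbalanced nature of the data is used as prior knowledge to form pseudo negative pairs by ranking the similarities between unannotated image patches and annotated anchor images. In this study, we used 95,358 unlabeled images and 5,830 labeled images to detect ancient buildings in a long-tailed satellite image dataset. Our semi-supervised contrastive learning model achieved a promising test balanced accuracy of 79.0%, a 3.8% improvement over state-of-the-art approaches.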
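Because the result above is reported as balanced accuracy, a short sketch of that metric may help. It uses the standard definition (the mean of per-class recall) and is not taken from the paper's code. The function name `balanced_accuracy` and the toy labels are illustrative assumptions.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, computed over the classes in y_true."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        correct = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


# Toy long-tailed set: 5 "site" patches (1) among 95 background patches (0).
y_true = [1] * 5 + [0] * 95
always_background = [0] * 100

plain_acc = sum(t == p for t, p in zip(y_true, always_background)) / len(y_true)
bal_acc = balanced_accuracy(y_true, always_background)
# plain_acc is 0.95 while bal_acc is 0.5
```

A classifier that never predicts the rare class scores 95% plain accuracy on this toy set but only 50% balanced accuracy, which is why the balanced variant is the more informative number for long-tailed data.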
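A prior plays an important role in providing plausible constraints on human motion. Previous works design motion priors following a variety of paradigms under different circumstances, which leads to a lack of versatility. In this paper, we first summarize the indispensable properties of a motion prior and accordingly design a framework for learning a versatile motion prior that models the inherent probability distribution of human motions. Specifically, for efficient prior representation learning, we propose a global orientation normalization that removes redundant environment information from the original motion data space. In addition, sequence-based and segment-based frequency guidance is introduced into the encoding stage. We then adopt a denoising training scheme to disentangle the environment information from the input motion data in a learnable way, so as to produce consistent and distinguishable representations. Embedding our motion prior into three different tasks, we conduct extensive experiments, and both quantitative and qualitative results demonstrate the versatility and effectiveness of our motion prior. Our model and code are available at https://github.com/jchenxu/human-motion-porion -prior.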
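Despite its great success, deep learning suffers severely from a lack of robustness; that is, deep neural networks are highly vulnerable to adversarial attacks, even the simplest ones. Inspired by recent advances in brain science, we propose Denoised Internal Models (DIM), a novel generative autoencoder-based model that tackles this challenge. Simulating the pipeline the human brain uses for visual signal processing, DIM adopts a two-stage approach. In the first stage, DIM uses a denoiser to reduce the noise and dimensionality of the inputs, reflecting the information preprocessing performed in the thalamus. Inspired by the sparse coding of memory-related traces in the primary visual cortex, the second stage produces a set of internal models, one for each category. We evaluate DIM against 42 adversarial attacks, showing that DIM effectively defends against all of them and outperforms the SOTA in overall robustness.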